71 research outputs found
Optimal Pose and Shape Estimation for Category-level 3D Object Perception
We consider a category-level perception problem, where one is given 3D sensor
data picturing an object of a given category (e.g. a car), and has to
reconstruct the pose and shape of the object despite intra-class variability
(i.e. different car models have different shapes). We consider an active shape
model, where -- for an object category -- we are given a library of potential
CAD models describing objects in that category, and we adopt a standard
formulation where pose and shape estimation are formulated as a non-convex
optimization. Our first contribution is to provide the first certifiably
optimal solver for pose and shape estimation. In particular, we show that
rotation estimation can be decoupled from the estimation of the object
translation and shape, and we demonstrate that (i) the optimal object rotation
can be computed via a tight (small-size) semidefinite relaxation, and (ii) the
translation and shape parameters can be computed in closed-form given the
rotation. Our second contribution is to add an outlier rejection layer to our
solver, hence making it robust to a large number of misdetections. Towards this
goal, we wrap our optimal solver in a robust estimation scheme based on
graduated non-convexity. To further enhance robustness to outliers, we also
develop the first graph-theoretic formulation to prune outliers in
category-level perception, which removes outliers via convex hull and maximum
clique computations; the resulting approach is robust to 70%-90% outliers. Our
third contribution is an extensive experimental evaluation. Besides providing
an ablation study on a simulated dataset and on the PASCAL3D+ dataset, we
combine our solver with a deep-learned keypoint detector, and show that the
resulting approach improves over the state of the art in vehicle pose
estimation in the ApolloScape datasets.
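The graduated non-convexity idea mentioned in the abstract above can be illustrated on a toy problem. The sketch below is not the authors' solver: it estimates a scalar mean rather than pose and shape, and it assumes a Geman-McClure surrogate cost. The function name `gnc_robust_mean` and the parameters `noise_bound` and `shrink` are hypothetical names chosen for illustration.

```python
def gnc_robust_mean(samples, noise_bound=0.5, shrink=1.4):
    """Toy robust mean via graduated non-convexity (assumed Geman-McClure cost)."""
    c_sq = noise_bound ** 2
    # Start from the ordinary, non-robust least-squares estimate.
    estimate = sum(samples) / len(samples)
    residuals_sq = [(s - estimate) ** 2 for s in samples]
    # A large mu makes the surrogate cost nearly convex; mu = 1 is the
    # original Geman-McClure cost.
    mu = max(2.0 * max(residuals_sq) / c_sq, 1.0)
    while True:
        # Weight update: large residuals are progressively down-weighted.
        weights = [(mu * c_sq / (r + mu * c_sq)) ** 2 for r in residuals_sq]
        # Variable update: weighted least squares (here, a weighted mean).
        estimate = sum(w * s for w, s in zip(weights, samples)) / sum(weights)
        residuals_sq = [(s - estimate) ** 2 for s in samples]
        if mu <= 1.0:
            break
        # Gradually restore the non-convexity of the cost.
        mu = max(mu / shrink, 1.0)
    return estimate, weights


# Seven inliers near 5.0 followed by three gross outliers (made-up data).
samples = [4.9, 5.0, 5.1, 5.05, 4.95, 5.0, 5.0, 50.0, -30.0, 100.0]
estimate, weights = gnc_robust_mean(samples)
```

In a pose and shape estimation setting, the weighted mean would be replaced by a weighted call to the non-minimal solver, with one weight per measurement.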
Loc-NeRF: Monte Carlo Localization using Neural Radiance Fields
We present Loc-NeRF, a real-time vision-based robot localization approach
that combines Monte Carlo localization and Neural Radiance Fields (NeRF). Our
system uses a pre-trained NeRF model as the map of an environment and can
localize itself in real-time using an RGB camera as the only exteroceptive
sensor onboard the robot. While neural radiance fields have seen significant
applications for visual rendering in computer vision and graphics, they have
found limited use in robotics. Existing approaches for NeRF-based localization
require both a good initial pose guess and significant computation, making them
impractical for real-time robotics applications. By using Monte Carlo
localization as a workhorse to estimate poses using a NeRF map model, Loc-NeRF
is able to perform localization faster than the state of the art and without
relying on an initial pose estimate. In addition to testing on synthetic data,
we also run our system using real data collected by a Clearpath Jackal UGV and
demonstrate for the first time the ability to perform real-time global
localization with neural radiance fields. We make our code publicly available
at https://github.com/MIT-SPARK/Loc-NeRF
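A minimal sketch of one Monte Carlo localization cycle (predict, weight, resample) may help make the abstract above concrete. It assumes a one-dimensional robot and a hypothetical range measurement to a wall; in Loc-NeRF the measurement likelihood would instead come from comparing camera pixels with pixels rendered from the NeRF map. None of the names below are taken from the Loc-NeRF code base.

```python
import math
import random


def mcl_step(particles, control, measurement, expected_measurement, rng,
             motion_noise=0.1, meas_noise=0.5):
    """One predict / update / resample cycle of a 1D particle filter."""
    # Predict: apply the motion command with sampled noise.
    predicted = [p + control + rng.gauss(0.0, motion_noise) for p in particles]
    # Update: Gaussian likelihood of the measurement for each particle.
    weights = [
        math.exp(-0.5 * ((measurement - expected_measurement(p)) / meas_noise) ** 2)
        for p in predicted
    ]
    if sum(weights) == 0.0:
        # Degenerate case: fall back to uniform weights.
        weights = [1.0] * len(predicted)
    # Resample: draw particles in proportion to their weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))


# Hypothetical corridor with a wall at x = 10; the sensor returns the distance to it.
def wall_range(x):
    return 10.0 - x


rng = random.Random(0)
true_x = 2.0
# Global localization: particles start spread over the whole corridor.
particles = [rng.uniform(0.0, 10.0) for _ in range(500)]
for _ in range(10):
    true_x += 0.5
    particles = mcl_step(particles, 0.5, wall_range(true_x), wall_range, rng)
estimated_x = sum(particles) / len(particles)
```

Because the particles are initialized uniformly, the filter needs no initial pose guess, which is the property the abstract highlights.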
Transport of topologically protected photonic waveguide on chip
We propose a new design for on-chip integrated optical devices with an extra
width degree of freedom by using a photonic crystal waveguide with Dirac points
between two photonic crystals with opposite valley Chern numbers. With such an
extra waveguide, we demonstrate numerically that the topologically protected
photonic waveguide keeps properties of valley-locking and immunity to defects.
Owing to the design flexibility of the width-tunable topologically protected
photonic waveguide, many unique on-chip integrated devices have been proposed,
such as energy concentrators with a concentration efficiency improved by
more than one order of magnitude and a topological photonic power splitter with
an arbitrary power-splitting ratio. The topologically protected photonic waveguide
with the width degree of freedom could be beneficial for scaling up photonic
devices and provides a new flexible platform for implementing integrated
photonic networks on chip.
Comment: 19 pages, 5 figures
Kimera: from SLAM to Spatial Perception with 3D Dynamic Scene Graphs
Humans are able to form a complex mental model of the environment they move
in. This mental model captures geometric and semantic aspects of the scene,
describes the environment at multiple levels of abstraction (e.g., objects,
rooms, buildings), and includes static and dynamic entities and their relations
(e.g., a person is in a room at a given time). In contrast, current robots'
internal representations still provide a partial and fragmented understanding
of the environment, either in the form of a sparse or dense set of geometric
primitives (e.g., points, lines, planes, voxels) or as a collection of objects.
This paper attempts to reduce the gap between robot and human perception by
introducing a novel representation, a 3D Dynamic Scene Graph (DSG), that
seamlessly captures metric and semantic aspects of a dynamic environment. A DSG
is a layered graph where nodes represent spatial concepts at different levels
of abstraction, and edges represent spatio-temporal relations among nodes. Our
second contribution is Kimera, the first fully automatic method to build a DSG
from visual-inertial data. Kimera includes state-of-the-art techniques for
visual-inertial SLAM, metric-semantic 3D reconstruction, object localization,
human pose and shape estimation, and scene parsing. Our third contribution is a
comprehensive evaluation of Kimera in real-life datasets and photo-realistic
simulations, including a newly released dataset, uHumans2, which simulates a
collection of crowded indoor and outdoor scenes. Our evaluation shows that
Kimera achieves state-of-the-art performance in visual-inertial SLAM, estimates
an accurate 3D metric-semantic mesh model in real-time, and builds a DSG of a
complex indoor environment with tens of objects and humans in minutes. Our
final contribution shows how to use a DSG for real-time hierarchical semantic
path-planning. The core modules in Kimera are open-source.
Comment: 34 pages, 25 figures, 9 tables. arXiv admin note: text overlap with
arXiv:2002.0628
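The layered-graph description in the abstract above can be sketched as a small data structure. This is an illustrative assumption, not Kimera's actual API: the class `LayeredSceneGraph`, its methods, the layer labels, and the node names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class LayeredSceneGraph:
    """Toy layered graph: nodes carry a layer label, edges carry a relation."""
    node_layer: dict = field(default_factory=dict)  # node id -> layer name
    edges: list = field(default_factory=list)       # (parent, child, relation, time)

    def add_node(self, node_id, layer):
        self.node_layer[node_id] = layer

    def add_edge(self, parent, child, relation, time=None):
        # Both endpoints must exist before they can be related.
        for node_id in (parent, child):
            if node_id not in self.node_layer:
                raise KeyError(f"unknown node: {node_id}")
        self.edges.append((parent, child, relation, time))

    def ancestors(self, node_id):
        """Walk parent edges upward, nearest ancestor first."""
        found = []
        frontier = [node_id]
        while frontier:
            current = frontier.pop()
            for parent, child, _relation, _time in self.edges:
                if child == current and parent not in found:
                    found.append(parent)
                    frontier.append(parent)
        return found


graph = LayeredSceneGraph()
graph.add_node("building_1", "buildings")
graph.add_node("room_3", "rooms")
graph.add_node("chair_7", "objects")
graph.add_node("person_2", "humans")  # hypothetical layer label
graph.add_edge("building_1", "room_3", "contains")
graph.add_edge("room_3", "chair_7", "contains")
# A time-stamped relation: the person is in the room at t = 12.5 s.
graph.add_edge("room_3", "person_2", "contains", time=12.5)
```

A query such as `graph.ancestors("person_2")` walks from a dynamic entity up through the room and building layers, the kind of hierarchical lookup a semantic path planner could build on.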
Single charge control of localized excitons in heterostructures with ferroelectric thin films and two-dimensional transition metal dichalcogenides
Single charge control of localized excitons (LXs) in two-dimensional
transition metal dichalcogenides (TMDCs) is crucial for potential applications
in quantum information processing and storage. However, the traditional
electrostatic doping method of applying metallic gates onto TMDCs may cause
inhomogeneous charge distribution, optical quenching, and energy loss. Here,
by locally controlling the ferroelectric polarization of the ferroelectric thin
film BiFeO3 (BFO) with a scanning probe, we can deterministically manipulate
the doping type of monolayer WSe2 to achieve p-type and n-type doping. This
nonvolatile approach can maintain the doping type and hold the localized
excitonic charges for a long time without applied voltage. Our work
demonstrates that the ferroelectric polarization of BFO can control the charges of
LXs effectively. Neutral and charged LXs have been observed in different
ferroelectric polarization regions, as confirmed by magneto-optical measurements.
A high degree of circular polarization, about 90%, of the photon emission from
these quantum emitters has been achieved in high magnetic fields. Controlling
the single charge of LXs in a nonvolatile way shows great potential for
deterministic photon emission with desired charge states for photonic long-term
memory.
Comment: 13 pages, 5 figures